Leakage errors occur when a quantum system leaves the two-level qubit subspace. Reducing these
errors is critically important for quantum error correction to be viable. To quantify leakage errors,
we use randomized benchmarking in conjunction with measurement of the leakage population. We
characterize single qubit gates in a superconducting qubit, and by refining our use of Derivative
Reduction by Adiabatic Gate (DRAG) pulse shaping along with detuning of the pulses, we obtain
gate errors consistently below 10^-3 and leakage rates at the 10^-5 level. With the control optimized,
we find that a significant portion of the remaining leakage is due to incoherent heating of the qubit.


Accurate manipulation of the states in a quantum two-level system (qubit) is a key requirement
for building a fault tolerant quantum processor [1]. However, many physical quantum systems
such as quantum dots [2] and superconducting qubits [3] have multiple levels, from which two
levels are chosen to form the computational subspace. The presence of non-computational
levels leads to two types of errors: leakage errors, where the quantum state populates
non-computational levels, and phase errors due to coupling of computational and
non-computational levels when driven by control fields [4, 5]. Previous experimental work
[6, 7] on superconducting qubits has focused on reducing phase errors, because they were the
dominant source of total gate infidelity. Indeed, the suppression of phase errors using
Derivative Reduction by Adiabatic Gate (DRAG) pulse shaping [8] has helped push single qubit
fidelity in superconducting qubits over 99.9%, nominally satisfying one of the requirements
for realizing quantum error correction (QEC) [9, 10].

However, total fidelity is not the only metric that determines the viability of QEC because
certain errors are more deleterious than others. Specifically, leakage errors are known to be
highly detrimental for error correcting codes such as the surface code, because interactions
with a qubit in a leakage state have a randomizing effect on the interacting qubits [11].
Moreover, leakage states can be as long-lived as the qubit states, leading to time-correlated
errors which further degrade performance [12]. These concepts were recently demonstrated in a
9 qubit repetition code [13], where single leakage events persisted for multiple error
detection cycles and propagated errors to neighboring qubits. Understanding and reducing
leakage is of critical importance for realizing QEC.

In this Letter, we characterize single qubit leakage errors in a superconducting qubit. To
estimate leakage errors, we use randomized benchmarking (RB) [14, 15] in conjunction with
measurements of leakage state populations. Using this method, we show that previous
experimental realizations of DRAG pulse shaping have a tradeoff between total fidelity and
leakage errors. We overcome this tradeoff using additional pulse shaping, and obtain gates
that have both state of the art fidelity and low leakage. Additionally, we use RB to measure
the dependence of leakage on pulse length.

Our experiment uses Clifford based randomized benchmarking [15], which is typically used to
characterize overall gate fidelity. In Clifford based RB, we apply a random sequence of gates
chosen from the single qubit Clifford group, which is the group of rotations that map the six
axial Bloch states to each other. We then append a recovery Clifford gate to the end of the
sequence such that the complete sequence is ideally the identity operation. Thus, the fidelity
of a sequence is the probability of mapping |0⟩ to |0⟩. Because the gates in each sequence are
chosen randomly, phase and amplitude errors accumulate incoherently, which leads to
exponential decay of the sequence fidelity with increasing sequence length. The crux of our
protocol is that randomization also accumulates leakage errors incoherently [16], such that
over many gates we build up leakage populations to a level that can be measured using current
techniques. We note that leakage errors as discussed here differ from irreversible loss of
the qubit; RB in the presence of loss was previously discussed in Ref. [17].

For our testbed we use a single Xmon transmon qubit [18, 19] (Qy) from the 9 qubit chain
described in Ref. [13]. The transmon has a weakly anharmonic potential, shown in Fig. 1(a),
which supports a ladder of energy levels. The two lowest levels form our qubit, and the
primary non-computational level is the |2⟩ state. Leakage errors arise when the qubit state is
directly excited to the |2⟩ state, while phase errors occur due to AC Stark shifting of the
1↔2 transition [4].

We operate the qubit at a frequency f_10 of around 5.3 GHz, and the anharmonicity
Δ = ω_21 − ω_10 is 2π × −212 MHz. Microwave (XY) control is achieved using a capacitively
coupled transmission line driven at the qubit frequency. We generate control pulses using a
custom arbitrary waveform generator, and the pulses are shaped with a cosine envelope. We
measure the qubit state using a dispersive readout scheme [20] in conjunction with a bandpass
filter [21] and a wideband parametric amplifier [22]. This setup allows us to discriminate the
|2⟩ state in addition to the two computational levels with high fidelity [16]. The T1 of the
device at the operating frequency is 22 µs, while a Ramsey experiment shows two characteristic
decay times [23], an exponential decay time of 8 µs and a Gaussian decay time of 1.8 µs.

To illustrate our novel use of RB, we begin by measuring how DRAG suppresses leakage and phase
errors. We use the simplified version of DRAG described in Refs. [6, 7]. Given a control
envelope Ω(t), we add the time derivative dΩ(t)/dt to the quadrature component:

    \Omega'(t) = \Omega(t) - i \alpha \frac{\dot{\Omega}(t)}{\Delta}    (1)

where α is a weighting parameter. Fourier analysis [4, 24] shows that the DRAG correction
suppresses the spectral weight of the control pulse at the 1↔2 transition if α = 1.0, which
minimizes leakage errors. However, the optimal value to compensate the AC Stark shift and
correct for phase errors is α = 0.5 [4, 6].
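
As a minimal sketch of the simple DRAG correction of Eq. (1), the snippet below evaluates the corrected envelope Ω(t) − iα (dΩ/dt)/Δ sample by sample. The text only states that the pulses have a cosine envelope, so the raised-cosine form (A/2)[1 − cos(2πt/T)], the unit amplitude, and all function names are illustrative assumptions, not the authors' waveform code.

```python
import math

# Anharmonicity from the text, 2*pi x -212 MHz, expressed in rad/ns.
DELTA = 2.0 * math.pi * (-0.212)
PULSE_LENGTH_NS = 10.0


def cosine_envelope(t, length, amplitude=1.0):
    """Assumed raised-cosine envelope, zero at t = 0 and t = length."""
    if t < 0.0 or t > length:
        return 0.0
    return 0.5 * amplitude * (1.0 - math.cos(2.0 * math.pi * t / length))


def cosine_envelope_derivative(t, length, amplitude=1.0):
    """Analytic time derivative of cosine_envelope."""
    if t < 0.0 or t > length:
        return 0.0
    return 0.5 * amplitude * (2.0 * math.pi / length) * math.sin(2.0 * math.pi * t / length)


def drag_envelope(t, length, alpha, delta, amplitude=1.0):
    """Complex envelope: in-phase part Omega, quadrature part -alpha * dOmega/dt / Delta."""
    omega = cosine_envelope(t, length, amplitude)
    omega_dot = cosine_envelope_derivative(t, length, amplitude)
    return complex(omega, -alpha * omega_dot / delta)


if __name__ == "__main__":
    # Sample the alpha = 1.0 pulse every 0.5 ns.
    for k in range(21):
        t = 0.5 * k
        print(t, drag_envelope(t, PULSE_LENGTH_NS, alpha=1.0, delta=DELTA))
```

In this sketch the quadrature component vanishes at the pulse center, where the envelope derivative is zero, and the in-phase component is unchanged for any α.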

We confirm these concepts by performing Clifford based RB using 10 ns microwave pulses shaped
with three different values of α (0, 0.5, and 1.0), as shown in Fig. 1(b). We combine up to
three pulses to form a single Clifford gate; on average each Clifford contains 1.5 π/2-pulses
and 0.375 π-pulses, resulting in an average gate length of 18.75 ns. Figure 1(c) shows
sequence fidelity decay curves for the three values of α. As expected, using α = 0.5 results
in higher fidelities than α = 0.0 or α = 1.0. We can quantify this improvement from the
characteristic scale of the decay p, obtained by fitting to A p^m + B, where m is the sequence
length and A and B encapsulate state preparation and measurement errors. We then estimate the
error per Clifford as r_Clifford = (1 − p)/2 [15]. For α = 0.5, we obtain an error per
Clifford of 9.6 ± 0.1 × 10^-4, while for α = 0.0 and α = 1.0 we obtain errors of
6.3 ± 0.2 × 10^-3 and 1.20 ± 0.01 × 10^-2 per Clifford, respectively.
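
The arithmetic in this paragraph can be checked with a short sketch. It assumes only what the text states: π and π/2 pulses of equal length, the decay model A p^m + B, and r_Clifford = (1 − p)/2. The function names and the default values A = B = 0.5 (no state preparation or measurement error) are illustrative.

```python
def average_clifford_length_ns(pulse_length_ns, half_pi_per_clifford=1.5, pi_per_clifford=0.375):
    """Average Clifford duration when pi and pi/2 pulses share one length."""
    return (half_pi_per_clifford + pi_per_clifford) * pulse_length_ns


def sequence_fidelity(m, p, a=0.5, b=0.5):
    """RB decay model A * p**m + B for a sequence of m Cliffords."""
    return a * p ** m + b


def error_per_clifford(p):
    """Single-qubit error per Clifford, r = (1 - p) / 2."""
    return (1.0 - p) / 2.0


def decay_constant(r):
    """Inverse relation, p = 1 - 2 r."""
    return 1.0 - 2.0 * r


if __name__ == "__main__":
    print(average_clifford_length_ns(10.0))  # 18.75
    p = decay_constant(1.0e-3)
    print(p, sequence_fidelity(100, p))
```

With 10 ns pulses the average Clifford length comes out to 18.75 ns, matching the value quoted above.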

Figure 1: (a) Weakly anharmonic potential of a transmon. When driving |0⟩ to |1⟩, direct
excitation to |2⟩ (red arrow) causes leakage errors, while AC Stark repulsion of the 1↔2
transition (dashed lines) leads to phase errors. (b) The simple DRAG correction, which adds
the derivative of the envelope to the quadrature component of the envelope. Three different
DRAG weightings (α) are shown. (c) Exponential decay of sequence fidelity from randomized
benchmarking, showing data for the three values of α. Each point is the average of 75
different random sequences. Fidelity is highest for α = 0.5. (d) |2⟩ state population vs
sequence length, showing the accumulation of leakage with sequence length. Leakage is lowest
for α = 1.0.

Simultaneously, we characterize leakage errors in our gateset from the dynamics of the |2⟩
state measured while performing RB, as shown in Fig. 1(d). For all three values of α, the |2⟩
state population shows an exponential approach to a saturation population. Without
correction, this saturation population is significant at about 10%, but decreases by about a
factor of 3 for α = 0.5 and by a factor of 10 for α = 1.0. To quantify the leakage rate per
Clifford, we fit the |2⟩ state dynamics to a simple rate equation that takes into account
leakage from the computational subspace into the |2⟩ state and decay of the |2⟩ state back
into the subspace [16].


    p_{|2\rangle}(m) = p_\infty \left(1 - e^{-\Gamma m}\right) + p_0 \, e^{-\Gamma m}    (2)

    \Gamma = \gamma_\uparrow + \gamma_\downarrow, \qquad p_\infty = \gamma_\uparrow / \Gamma    (3)

where p_|2⟩(m) is the |2⟩ state population as a function of sequence length m, γ↑ and γ↓ are
the leakage and decay rates per Clifford, and p_0 is the initial |2⟩ state population. Using
Eq. (2), we extract leakage rates of 3.92 ± 0.08 × 10^-4, 1.02 ± 0.02 × 10^-4, and
2.18 ± 0.08 × 10^-5 for α = 0, 0.5, and 1.0.
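
A minimal sketch of the rate-equation model of Eqs. (2) and (3) follows. The rates used in the example are hypothetical round numbers chosen to give a 10% saturation population; they are not fitted values from the experiment.

```python
import math


def leakage_population(m, gamma_up, gamma_down, p0=0.0):
    """|2> population after m Cliffords: p_inf * (1 - e^(-Gamma m)) + p0 * e^(-Gamma m)."""
    rate = gamma_up + gamma_down      # Gamma
    p_inf = gamma_up / rate           # saturation population
    decay = math.exp(-rate * m)
    return p_inf * (1.0 - decay) + p0 * decay


if __name__ == "__main__":
    # Hypothetical rates per Clifford (illustrative only).
    gamma_up, gamma_down = 4.0e-4, 3.6e-3
    for m in (0, 100, 300, 1000, 3000):
        print(m, leakage_population(m, gamma_up, gamma_down))
```

The population starts at p_0, rises on a scale of 1/Γ Cliffords, and saturates at γ↑/Γ, which is why both the leakage rate and the decay rate can be extracted from a single curve.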

Figure 2: (a) Control envelopes with simple DRAG with (right) and without (left) detuning of
the pulse. The detuning is exaggerated for illustration. (b) We sweep over the detuning δf
while performing the pseudo-identity sequence shown in the inset. The sequence maps back to
|0⟩ when the detuning is optimized. Multiple applications of the pulse sequence increase the
sensitivity of the measurement. (c) Quantum state trajectories plotted on projections of the
Bloch sphere, with (bottom) and without (top) optimal detuning. The data are obtained by
performing quantum state tomography (QST) after applying a variable X rotation, with the
rotation angle ranging from 0 to π.

The results from RB confirm the theory behind simple DRAG: we can minimize either phase error
or leakage error, but not both. To simultaneously optimize for both gate fidelity and leakage
performance, we would like to minimize leakage using simple DRAG, then separately compensate
the AC Stark shift. In the original DRAG theory, the Stark shift was compensated using a time
dependent detuning of the qubit [8]. However, as was previously noted in Refs. [4, 5], a
constant detuning should also be able to compensate the AC Stark shift. Given an envelope
Ω′(t), which in general can have a quadrature correction, we generate a new envelope

    \Omega''(t) = \Omega'(t) \, e^{2 \pi i \, \delta f \, t}    (4)

where δf is the detuning of the pulse from the qubit frequency. We also redefine the
anharmonicity parameter in Eq. (1) to be Δ = ω_21 − (ω_10 + 2πδf), so that leakage suppression
still occurs at the 1↔2 frequency. An example of a detuned pulse is shown in Fig. 2(a).
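
The constant pulse detuning can be sketched as a phase ramp applied to each complex envelope sample, together with the redefined anharmonicity Δ = ω_21 − (ω_10 + 2πδf). The sign convention of the phase ramp, the function names, and the example frequencies (taken from the values quoted in the text, with δf = −30 MHz as quoted in the supplementary material) are assumptions of this illustration.

```python
import cmath
import math


def detune(envelope_value, t_ns, delta_f_ghz):
    """Apply a constant detuning delta_f (GHz) as the phase ramp exp(2*pi*i*delta_f*t)."""
    return envelope_value * cmath.exp(2j * math.pi * delta_f_ghz * t_ns)


def shifted_anharmonicity(omega_21, omega_10, delta_f_ghz):
    """Delta = omega_21 - (omega_10 + 2*pi*delta_f), all in rad/ns."""
    return omega_21 - (omega_10 + 2.0 * math.pi * delta_f_ghz)


if __name__ == "__main__":
    omega_10 = 2.0 * math.pi * 5.3              # qubit frequency, rad/ns
    omega_21 = 2.0 * math.pi * (5.3 - 0.212)    # 1<->2 transition, rad/ns
    delta_f = -0.030                            # -30 MHz pulse detuning
    print(shifted_anharmonicity(omega_21, omega_10, delta_f) / (2.0 * math.pi))
    print(detune(complex(0.5, 0.2), 7.0, delta_f))
```

The phase ramp leaves the magnitude of the envelope untouched; only the DRAG denominator changes, so that the spectral notch stays on the 1↔2 transition.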

To optimize the detuning parameter δf, we sweep the detuning of a π-pulse while performing the
pseudo-identity operation of a π-pulse followed by a −π-pulse along the same rotation axis
[6, 25]. As shown in Fig. 2(b), the pulse detuning is optimized when the |0⟩ state population
is maximized, and the pseudo-identity can be applied multiple times to increase the resolution
of the measurement. To verify that the detuning has suppressed phase errors, we perform
quantum state tomography after applying a control pulse to our qubit while ramping the
amplitude of the pulse, as shown in Fig. 2(c). Without detuning, the Bloch vector never
reaches the pole, while the behavior is much closer to ideal when the detuning is optimized.

Figure 3: Total gate fidelity and leakage rates versus DRAG weighting α, measured using RB.
(a) Without using pulse detunings, we require different values of α to minimize overall error
versus leakage errors. (b) By optimizing our pulses using detunings, we obtain high fidelity
for any α, and are free to choose α to minimize leakage.

We now explore in more detail the dependence of fidelity and leakage on α. In Fig. 3, we show
parameters extracted from RB with 10 ns pulses while varying α between 0.0 and 1.5. Without
detuning the pulses, we find the minimum error per Clifford to be 7.9 ± 3 × 10^-4 when
α = 0.4. We note that this is a deviation from the expected optimal value of α = 0.5; we
attribute this deviation to distortions of the pulse between the waveform generator and the
qubit [25]. Away from the optimal α, the error increases rapidly.

Next, we optimize the detuning of the pulses for each value of α using the method described
in Fig. 2. We find that when using π and π/2 pulses with the same length, using the same
detuning for both types of pulses yields the best results. After calibrating the detuning, we
recalibrate the pulse amplitudes, then run a short Nelder-Mead optimization on the RB
fidelity to get final adjustments to pulse parameters [26]. With these optimizations, we find
the average error per Clifford for all values of α to be 9.1 × 10^-4, with a standard
deviation of 1 × 10^-4. In other words, we can tune up high fidelity gates for any value of α.

Figure 4: (a) Leakage rate versus pulse length, with α = 0.0 and α = 1.1. The dashed line is
the lower bound on leakage calculated from the heating rate. (b) Heating of the qubit from
|1⟩ to |2⟩. We prepare the qubit in |1⟩, wait for time t, then measure the qubit state. Inset:
The dynamics of all three states, primarily showing T1 decay of |1⟩ to |0⟩. Main figure: Zoom
in of the |2⟩ state dynamics, showing an increase in population due to heating before
relaxing back to zero. The data have been corrected for readout visibility. The dashed line
is a rate equation fit, from which we extract the heating rate plotted in (a).

With gate fidelity now independent of α, we are free to implement DRAG solely to minimize
leakage. Without detuning, the minimum leakage rate is 1.82 ± 0.07 × 10^-5 for α = 1.1. After
detuning the pulses for optimal fidelity, we see shifts in the leakage rates. For α > 0.4, we
detune the pulses towards the 1↔2 transition [16], which tends to increase the leakage rate.
Nevertheless, we can still suppress leakage to the same level as the undetuned pulses by
increasing α to 1.4. Using these parameters, we achieve both high fidelity
(8.7 ± 0.4 × 10^-4 error per Clifford) and low leakage (1.2 ± 0.1 × 10^-5) [16].

Having characterized 10 ns pulses in detail, we now examine the dependence of leakage on pulse
length. As noted in Fig. 3, pulse detuning can affect the leakage rate; for simplicity we set
the detuning to zero for the following measurements. We initially set α = 0.0 and measure the
leakage rate while varying the length of our pulses between 8 ns and 50 ns and calibrating
the pulse amplitudes accordingly. We then repeat this measurement with α = 1.1, where we
previously found leakage to be suppressed in Fig. 3(a). The results are shown in Fig. 4(a).
For short pulses, we observe that the leakage rate decreases exponentially with increasing
pulse length, and that the DRAG correction generally suppresses leakage by an order of
magnitude or more. However, as the pulse length increases past 15 ns, the leakage rate begins
to level off and even begins to increase. Furthermore, the effect of DRAG is no longer
distinguishable for pulses longer than 20 ns. These results suggest that for long pulses,
leakage is the result of incoherent processes such as thermal excitations or noise at the
1↔2 transition, rather than coherent processes such as control errors.

To measure the incoherent leakage rate, we prepare the qubit in the |1⟩ state and measure the
dynamics of the three qubit states, as shown in Fig. 4(b). We see that the |2⟩ state
population initially rises over 20 µs, corresponding to heating from |1⟩ to |2⟩. Then, the |2⟩
population slowly decays to zero as both excited states relax due to T1 processes. We model
the |2⟩ population using a rate equation with three rates: decay from |2⟩ to |1⟩, decay from
|1⟩ to |0⟩, and heating from |1⟩ to |2⟩. We ignore nonsequential transitions since they are
suppressed in the nearly harmonic transmon potential [27], as well as heating from |0⟩ to |1⟩
since we assume the initial state is |1⟩. We extract the two decay rates from T1
measurements, which give T1 = 22 µs for |1⟩ and T1 = 18 µs for |2⟩ [28]. The remaining
parameter to fit is the 1→2 heating rate, which we find to be 1/(2.2 ms) [16].
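
The three-rate model described above can be sketched with a simple forward-Euler integration. The rates are the ones quoted in the text (T1 of 22 µs for |1⟩, 18 µs for |2⟩, heating time 2.2 ms); the integration scheme, step size, and function name are illustrative assumptions and not the authors' fitting procedure.

```python
def simulate_heating(t1_state1_us=22.0, t1_state2_us=18.0, heating_time_us=2200.0,
                     t_max_us=100.0, dt_us=0.01):
    """Integrate populations (p0, p1, p2) starting from the qubit prepared in |1>.

    Only sequential transitions are kept: |2>->|1> decay, |1>->|0> decay,
    and |1>->|2> heating. Returns a list of (time_us, p0, p1, p2) tuples.
    """
    p0, p1, p2 = 0.0, 1.0, 0.0
    history = [(0.0, p0, p1, p2)]
    steps = int(round(t_max_us / dt_us))
    for k in range(1, steps + 1):
        decay_21 = p2 / t1_state2_us     # population flow |2> -> |1>
        decay_10 = p1 / t1_state1_us     # population flow |1> -> |0>
        heat_12 = p1 / heating_time_us   # population flow |1> -> |2>
        p0 += dt_us * decay_10
        p1 += dt_us * (decay_21 - decay_10 - heat_12)
        p2 += dt_us * (heat_12 - decay_21)
        history.append((k * dt_us, p0, p1, p2))
    return history


if __name__ == "__main__":
    history = simulate_heating()
    t_peak, _, _, p2_peak = max(history, key=lambda row: row[3])
    print("peak |2> population", p2_peak, "at", t_peak, "us")
```

With these rates the |2⟩ population in the sketch peaks near 20 µs, consistent with the rise time described in the text, before relaxing back toward zero.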

We convert this heating rate to a leakage rate per Clifford using the prescription in
Ref. [23]. The resulting lower bound on leakage due to heating is shown in the dashed line in
Fig. 4(a). For pulses longer than 15 ns, we find that the leakage rate is within a factor of
2 of this lower bound, confirming that even at relatively short timescales, we are being
limited by incoherent processes. We note that the heating rate and T1 decay rate are
consistent with an equilibrium population of 0.8% for the |1⟩ state [16]. In other works,
equilibrium populations closer to 0.1% have been achieved [29], suggesting that incoherent
leakage can be reduced through improved thermalization.
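
The 0.8% figure can be checked against the quoted rates with a detailed-balance estimate: at equilibrium the upward and downward flows between two adjacent levels cancel, so the population ratio equals the heating rate divided by the decay rate. Treating this ratio as the excited-state population, and pairing the 2.2 ms heating time with the 18 µs T1 of |2⟩, are assumptions of this sketch; the function names are hypothetical.

```python
def detailed_balance_ratio(t1_us, heating_time_us):
    """Upper/lower population ratio: (1 / heating time) / (1 / T1) = T1 / heating time."""
    return t1_us / heating_time_us


def two_level_excited_population(t1_us, heating_time_us):
    """Excited-state population of a two-level system with the same rates."""
    ratio = detailed_balance_ratio(t1_us, heating_time_us)
    return ratio / (1.0 + ratio)


if __name__ == "__main__":
    # 1<->2 transition: T1 = 18 us, heating time 2.2 ms (values from the text).
    print(detailed_balance_ratio(18.0, 2200.0))
    print(two_level_excited_population(18.0, 2200.0))
```

Both estimates come out close to 0.8%, in line with the equilibrium population quoted above.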

In conclusion, we have used single qubit randomized benchmarking to study leakage errors in a
superconducting qubit. Using RB, we show that simple DRAG correction alone cannot minimize
leakage and total gate error simultaneously, but by detuning our pulses, we obtain gates with
both high fidelity and low leakage. We also measured the dependence of leakage on pulse
length, and found that heating of the qubit is a significant source of leakage in our system.
Because RB is platform independent, this method should be applicable to other systems
provided they have high fidelity measurement of their leakage states. This method should also
be extendable to two-qubit gates, where entangling interactions can be a significant source
of leakage [4].


RATE EQUATION FOR |2⟩ STATE POPULATION

In this section we discuss the rate equation which describes the |2⟩ state population in the
randomized benchmarking (RB) procedure.

Neglecting the population of the |3⟩ state and higher levels, it is natural to describe
(phenomenologically) the average population p_|2⟩(m) of the state |2⟩ after m Cliffords using
the evolution equation

    p_{|2\rangle}(m+1) = p_{|2\rangle}(m) + \gamma_\uparrow \left[1 - p_{|2\rangle}(m)\right] - \gamma_\downarrow \, p_{|2\rangle}(m),    (1)

where γ↑ is the probability of the |2⟩ state excitation per Clifford, averaged over Cliffords
and also over the initial state in the qubit subspace, while γ↓ is the probability of
returning from the state |2⟩ to the qubit subspace, averaged over Cliffords. We emphasize that
Eq. (1) would be invalid for a particular RB sequence, but we apply it only assuming
averaging over the RB sequences: p_|2⟩(m), γ↑, and γ↓ are all averaged values. So far we have
introduced Eq. (1) phenomenologically; we will discuss the applicability of this equation
later.

The solution to Eq. (1) is

    p_{|2\rangle}(m) = C (1 - \Gamma)^m + p_\infty, \qquad p_\infty = \frac{\gamma_\uparrow}{\Gamma}, \qquad \Gamma = \gamma_\uparrow + \gamma_\downarrow,    (2)

where C is a constant determined by the initial condition, C = p_|2⟩(0) − p_∞. In the case
Γ ≪ 1 this solution can be replaced with

    p_{|2\rangle}(m) = \left[p_{|2\rangle}(0) - p_\infty\right] e^{-\Gamma m} + p_\infty,    (3)

which corresponds to the standard rate equation

    \frac{d p_{|2\rangle}(m)}{dm} = \gamma_\uparrow \left[1 - p_{|2\rangle}(m)\right] - \gamma_\downarrow \, p_{|2\rangle}(m),    (4)

to which Eq. (1) reduces when m is considered as a quasicontinuous variable (m ≫ 1). Thus, m
plays the role of the dimensionless time, while γ↑ and γ↓ are the excitation and relaxation
rates in this dimensionless time. Note that if p_|2⟩(0) = 0, then Eq. (3) becomes
p_|2⟩(m) = p_∞ (1 − e^{−Γ m}).
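
As a quick numerical check of the statements above, the sketch below iterates the per-Clifford recurrence and compares it with the exact discrete solution C(1 − Γ)^m + p_∞ and with the exponential approximation valid for Γ ≪ 1. The rates in the example are hypothetical and the function names are illustrative.

```python
import math


def iterate_recurrence(m, gamma_up, gamma_down, p_init=0.0):
    """Apply p -> p + gamma_up * (1 - p) - gamma_down * p a total of m times."""
    p = p_init
    for _ in range(m):
        p = p + gamma_up * (1.0 - p) - gamma_down * p
    return p


def discrete_solution(m, gamma_up, gamma_down, p_init=0.0):
    """Closed form C * (1 - Gamma)**m + p_inf with C = p_init - p_inf."""
    rate = gamma_up + gamma_down
    p_inf = gamma_up / rate
    return (p_init - p_inf) * (1.0 - rate) ** m + p_inf


def continuous_solution(m, gamma_up, gamma_down, p_init=0.0):
    """Exponential approximation (p_init - p_inf) * exp(-Gamma * m) + p_inf."""
    rate = gamma_up + gamma_down
    p_inf = gamma_up / rate
    return (p_init - p_inf) * math.exp(-rate * m) + p_inf


if __name__ == "__main__":
    gamma_up, gamma_down = 4.0e-4, 3.6e-3   # hypothetical rates per Clifford
    for m in (10, 100, 500):
        print(m,
              iterate_recurrence(m, gamma_up, gamma_down),
              discrete_solution(m, gamma_up, gamma_down),
              continuous_solution(m, gamma_up, gamma_down))
```

For rates of this size the discrete and continuous forms differ by less than about 10^-4 in population, which is why the exponential form can be used for fitting.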

Also note that if the observed probabilities p̃_|2⟩ differ from the actual probabilities p_|2⟩
due to measurement infidelity in a linear way, p̃_|2⟩(m) = A p_|2⟩(m) + B[1 − p_|2⟩(m)] (here
A ≲ 1 is the fidelity of the state |2⟩ measurement, while B ≪ 1 is the average probability of
misidentifying a state within the qubit subspace as the |2⟩ state), then Eqs. (1)-(4) remain
valid for p̃_|2⟩(m), but with slightly changed rates: γ↑ → γ̃↑ = A γ↑ + B γ↓,
γ↓ → γ̃↓ = Γ − γ̃↑, and Γ̃ = Γ. Therefore, the rates γ̃↑ and γ̃↓ extracted from the RB results
may slightly differ from the actual rates γ↑ and γ↓.

Next we discuss the applicability of the rate equation (1) for the |2⟩ state population. A
rate equation usually assumes incoherent processes. However, in our case both coherent and
incoherent processes are important: while the rate γ↓ is mostly determined by incoherent
energy relaxation, the rate γ↑ is mostly determined (at least for short gates) by a unitary
evolution, though with possibly fluctuating pulse shapes. Therefore, it is not obvious if the
simple rate equation is applicable. Note that we do not apply random ±1 pulses for the |2⟩
state as was suggested [1-3] for formal randomization of the coherent processes. In our
opinion, for practical purposes it is not necessary because of the different transition
frequencies ω_21 and ω_10. To illustrate this argument, let us assume only coherent
excitations of the |2⟩ state and consider the evolution of the wavefunction
c_0|0⟩ + c_1|1⟩ + c_2|2⟩ in the rotating frame based on ω_10. Then for a particular sequence
of Cliffords (assuming |c_2|² ≪ 1)

    c_2(m) = c_2(0) + \sum_{k=1}^{m} a_k \, e^{i(\omega_{21} - \omega_{10}) t_k},    (5)

where the complex number a_k is the contribution from the kth Clifford in the sequence
(γ↑ = ⟨|a_k|²⟩) and t_k is the start time of the kth Clifford. For
ω_21 − ω_10 = 2π × −212 MHz and elementary gate time > 10 ns, it is unlikely that the phase
shifts in Eq. (5) are close to exact integer multiples of 2π. Therefore, even if averaging
over Cliffords and initial states does not provide full randomization in the sense that
⟨a_k⟩ ≠ 0, the extra phase factor (accumulating with k) helps to average the contributions to
zero, so that in this example |c_2(m)|² ∝ m (from two-dimensional random walk), as would also
be expected from a simple rate-equation model. Thus, we expect that the rate

Figure 1. Phase space points corresponding to the qubit being prepared in the |0⟩ (blue),
|1⟩ (red), and |2⟩ (green) states. Out of a total of 50,000 preparations of each state, 5000
are shown here. The states are discriminated based on their distance from the center of the
cloud corresponding to each state. Points of one color positioned in a cloud of a different
color indicate readout errors. The white circles in each cloud have radii corresponding to
one standard deviation of the complex data in each cloud.


the three states 50,000 times and measure. The raw IQ
points of the demodulated signal [6] are shown in Fig. S1.
The probability of measuring the qubit in each state given
preparation in another is as follows:


/ 0.993 0.0069 5 x 10“^ \ 
0.055 0.945 5 x 10“^ 

\ 0.0246 0.083 0.892 J 


where the rows indicate the state prepared and the
columns indicate the state measured. The primary source
of error is T1 decay of the excited states. The readout frequency
was chosen to maximize the separation between
the |2⟩ state and the |1⟩ state, resulting in a separation
error between the two clouds of phase space points of
around 1 x 10“^. However, the actual probability of 
preparing |1⟩ and measuring |2⟩ is greater, at around
5 × 10^-4. This is consistent with the heating rate of
4 × 10^-7 per nanosecond as measured in the main paper,
multiplied by the readout time of 1 µs.
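
The consistency check in this paragraph is a one-line unit conversion, sketched below. It assumes the 1/(2.2 ms) heating rate from the main paper and a linear (short-time) approximation for the excitation probability; the function names are illustrative.

```python
def rate_per_ns(time_constant_ms):
    """Convert a time constant in milliseconds to a rate per nanosecond."""
    return 1.0 / (time_constant_ms * 1.0e6)


def excitation_probability(rate, duration_ns):
    """Linear estimate rate * duration, valid while the result is much less than 1."""
    return rate * duration_ns


if __name__ == "__main__":
    heating = rate_per_ns(2.2)                     # 1/(2.2 ms) in units of 1/ns
    print(heating)
    print(excitation_probability(heating, 1000.0)) # over a 1 us readout
```

The heating rate is about 4.5 × 10^-7 per nanosecond, so a 1 µs readout gives an excitation probability of about 4.5 × 10^-4.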

In general, we do not correct for measurement fidelity 
except in the thermalization measurement shown in Fig. 4 
of the main article. As noted above, the extraction of 
leakage rates from RB data is affected by readout fidelity. 
Thus, the leakage rates we quote are about 10% lower 
than the actual leakage rates. 


equation should work well for coherent contributions to 
the leakage, and since it also works for incoherent pro¬ 
cesses, we expect the rate equation to be well applicable 
to our RB procedure. Experimental results presented in 
the main text confirm this expectation. 
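
The random-walk argument for coherent leakage can be illustrated with a small Monte Carlo sketch. It makes a strong simplification that is not in the text: every Clifford contributes a step of fixed magnitude with a uniformly random phase, whereas the argument above relies on the phase factor accumulating with the gate start time. The step size, seed, and function name are hypothetical.

```python
import cmath
import math
import random


def mean_leakage_population(m, n_sequences, step=1.0e-3, seed=0):
    """Average |c2|^2 after m steps of a 2D random walk with fixed step size."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_sequences):
        c2 = 0j
        for _ in range(m):
            phase = rng.uniform(0.0, 2.0 * math.pi)
            c2 += step * cmath.exp(1j * phase)
        total += abs(c2) ** 2
    return total / n_sequences


if __name__ == "__main__":
    step = 1.0e-3
    for m in (25, 50, 100):
        mean = mean_leakage_population(m, 2000, step)
        # Ratio to the rate-equation expectation m * step**2; close to 1 on average.
        print(m, mean / (m * step ** 2))
```

In this idealized picture the averaged |2⟩ population grows linearly with the number of Cliffords, with |step|² playing the role of the leakage rate γ↑, as a simple rate equation would predict.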

MEASUREMENT SETUP 

The measurement setup is largely as described in the supplementary information for Ref. 4,
with two primary differences. First, the qubits are no longer statically biased with a
programmable voltage source separate from the Z-control DAC. Instead they are operated by
internally adding a DC offset to the output of the control DAC. As such, the bias tees and
attenuators on the Z-control lines at the 20 mK stage were removed. Second, the
thermalization of all lines was improved by clamping the lines to all stages from 4 K to
20 mK using copper thermal anchors [5].

STATE DISCRIMINATION 

Readout parameters for this device have previously been detailed in Ref. 4. At the operating
point used for the experiment, we find the dispersive shift to be about 1 MHz. We read out
using a 1 µs pulse. To characterize our readout fidelity, we prepare the qubit in each of

DEPENDENCE OF OPTIMAL PULSE 
DETUNING ON DRAG WEIGHT AND PULSE 
LENGTH 

In Fig. 2(a) we show the dependence of the optimal pulse detuning on the DRAG weight α for
three different π-pulse lengths. For each pulse length, the dependence is linear, and the
slope becomes more shallow with longer pulse length. In Fig. 2(b), we plot the dependence of
this slope on pulse length. We find that the slope between optimal detuning and DRAG weight
is proportional to the inverse square of the pulse length. Equivalently stated, the slope
depends quadratically on the drive strength, which we expect because the AC Stark shift
scales quadratically with the strength of the driving field.
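
The equivalence between the two statements can be made explicit with a short scaling sketch. It assumes a fixed rotation angle and a fixed envelope shape, so that the drive amplitude scales as the inverse of the pulse length; the 10 ns reference length and the function names are illustrative choices.

```python
def relative_drive_amplitude(pulse_length_ns, reference_length_ns=10.0):
    """Amplitude relative to the reference pulse, for the same rotation angle."""
    return reference_length_ns / pulse_length_ns


def relative_detuning_slope(pulse_length_ns, reference_length_ns=10.0):
    """Slope of optimal detuning vs DRAG weight, scaling as amplitude squared."""
    return relative_drive_amplitude(pulse_length_ns, reference_length_ns) ** 2


if __name__ == "__main__":
    for length in (10.0, 20.0, 40.0):
        print(length, relative_detuning_slope(length))
```

Doubling the pulse length halves the drive amplitude and therefore reduces the slope by a factor of four.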


Figure 2. (a) Dependence of the optimal detuning on α. Three different pulse lengths are
shown. The dashed lines are linear fits. (b) The slopes from the linear fits as shown in (a),
for a range of pulse lengths. The dashed line is a fit to the inverse square of the pulse
length, as expected from the AC Stark shift.


LEAKAGE STATE DECAY

Equation 2 contains both a leakage rate and a decay rate of the |2⟩ state back into the
computational subspace. We show in Fig. 3 the decay rates corresponding to the data in
Fig. 3(a) of the main paper. The dashed line represents the decay expected due to T1 decay of
the |2⟩ state given an average Clifford time of t_Clifford = 18.75 ns. The T1 for the |2⟩
state we use here is 13 µs, as measured concurrently with the RB data. We note that this
differs from the 18 µs quoted in the context of Fig. 4 of the main paper because these
measurements were performed many days apart. Over that time scale, the fine features of the
spectrum of two-level state (TLS) defects tend to drift. In general, the decay rates are
higher than expected from T1 decay.
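
The expected incoherent decay per Clifford follows from the two numbers given above. The sketch assumes simple exponential T1 decay over one average Clifford duration; whether the dashed line in the figure uses exactly this expression is not stated, and the function name is illustrative.

```python
import math


def decay_probability_per_clifford(t_clifford_ns, t1_us):
    """Probability that |2> relaxes during one Clifford: 1 - exp(-t_Clifford / T1)."""
    return 1.0 - math.exp(-t_clifford_ns / (t1_us * 1000.0))


if __name__ == "__main__":
    # Average Clifford time 18.75 ns and |2> state T1 of 13 us, as quoted in the text.
    print(decay_probability_per_clifford(18.75, 13.0))
```

This gives an expected decay probability of roughly 1.4 × 10^-3 per Clifford, against which the decay rates extracted from RB can be compared.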


RAW DATA FOR SIMULTANEOUSLY 
OPTIMIZED FIDELITY AND LEAKAGE 

In Fig. 4, we show the raw randomized benchmarking data for 10 ns pulses simultaneously
optimized for fidelity and leakage, as described in Fig. 3(b) of the main article. Here,
α = 1.4 and δf = −30 MHz.


Figure 3. Decay probability of the |2⟩ state per Clifford measured using RB, corresponding to
Fig. 3(a) of the main paper. The dashed line indicates the expected incoherent decay from the
measured T1.

Figure 4. Raw randomized benchmarking data for pulses optimized for both gate fidelity and
leakage, plotted versus the number of Cliffords m. (a) Sequence fidelity decay. The error per
Clifford is 8.7 ± 0.4 × 10^-4. (b) Leakage accumulation. The leakage per Clifford is
1.2 ± 0.1 × 10^-5.


THERMALIZATION AT THE 1↔2 TRANSITION FREQUENCY

To verify the heating rate measured in Fig. 4 of the main article, we bias the qubit so that
the 0↔1 transition frequency is equal to the original 1↔2 frequency, which was about 5.1 GHz.
We measure the T1 of the |1⟩ state at this frequency to be 39 µs, roughly a factor of two
greater than the measured |2⟩ state T1 of 18 µs, as expected [7]. Next, we measure the
heating rate of the 0→1 transition by performing two measurements separated by a variable
delay time, as shown in Fig. 5. The first measurement heralds the |0⟩ state to ensure the
qubit is in |0⟩ at t = 0, and the second measurement probes the approach of the qubit to the
equilibrium population. We fit to a rate equation with two rates, the heating rate and the T1
decay rate; with the T1 fixed by the previous measurement, we fit the heating rate to be
1/(4.7 ms). Again, we find the heating time constant to be roughly a factor of 2 larger than
that of the |2⟩ state, which we measured to be 2.2 ms.

Figure 5. Heating of the qubit, measured by heralding the |0⟩ state, followed by a variable
delay and a second measurement. The dashed line is a fit to a rate equation, where the only
free parameter is the heating rate.



Figure 6. Suppressing leakage using second derivative DRAG. (a) Leakage rate extracted from
full Clifford based RB vs DRAG weighting (α1 and α2), using first derivative correction (red)
and second derivative correction (black). Data are for 10 ns pulses. (b) Leakage performance
when using both first and second derivative DRAG. The color corresponds to the |2⟩ state
population after 700 Cliffords, and is the average of 45 different random sequences. The
scale of the color is logarithmic. The dashed, horizontal red line corresponds to first
derivative correction only, while the vertical black line corresponds to second derivative
correction only. The open circle highlights the minimum leakage population, which was
3 × 10^-3.


DRAG WITH SECOND DERIVATIVE 
CORRECTION 

Reference 8 notes that for long pulses and large anharmonicity, leakage can be suppressed
using a DRAG-like technique with higher order derivatives. For example, DRAG correction with
the second derivative takes the following form:

    \Omega'(t) = \Omega(t) + \alpha_2 \frac{\ddot{\Omega}(t)}{\Delta^2}    (6)

where α2 is a weighting parameter. Note that unlike DRAG with first derivatives, the second
derivative correction is applied to the in-phase component, which means that it does not have
any effect on phase errors. We perform the same experiment as in Fig. 3(a) of the main paper
to compare first and second derivative DRAG correction for 10 ns pulses without any
detunings. As seen in Fig. 6(a), the second derivative correction does indeed suppress
leakage, with a minimum leakage rate of 5 × 10^-5 at α2 = 1.3. However, the first derivative
correction is still more effective by about a factor of 3 when optimized. Next, we implement
both first and second derivative corrections simultaneously.

Because we have increased the dimension of our parameter space, performing full RB
characterization by measuring leakage population versus sequence length for each set of
parameters would take a prohibitively long time. Instead, we measure the leakage population
for many random sequences but only for a single, large sequence length. We aim to measure the
leakage state population near saturation, which is correlated with the leakage rate if the
decay rate of the |2⟩ state is mostly independent of the parameters under consideration. In
Fig. 6(b), we show the |2⟩ state population after 700 Clifford gates, averaged over 45
different random sequences, while varying both the first and second derivative DRAG weights.
We see that there is a substantial parameter space over which leakage can be suppressed. We
obtain a minimum leakage population after 700 Cliffords of 3 × 10^-3 for α1 = 2.8 and
α2 = −1.8, which is a factor of 2 improvement over using only first derivative correction
(e.g. as seen in Fig. 4). However, using such a large α1 would also require a large detuning
to compensate for phase errors, which will increase leakage. Thus, while our data suggest
that there are still gains to be made in leakage performance, simultaneously optimizing for
fidelity and leakage while using second derivative DRAG is non-trivial and an ongoing topic
of research.